Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring
In this paper we address the following question: Can we approximately sample
from a Bayesian posterior distribution if we are only allowed to touch a small
mini-batch of data items for every sample we generate? An algorithm based on
the Langevin equation with stochastic gradients (SGLD) was previously proposed
to solve this, but its mixing rate was slow. By leveraging the Bayesian Central
Limit Theorem, we extend the SGLD algorithm so that at high mixing rates it
will sample from a normal approximation of the posterior, while for slow mixing
rates it will mimic the behavior of SGLD with a pre-conditioner matrix. As a
bonus, the proposed algorithm is reminiscent of Fisher scoring (with stochastic
gradients) and as such is an efficient optimizer during burn-in.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012)
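To make the mini-batch sampling idea concrete, here is a minimal sketch of the plain SGLD update that the abstract names as the earlier algorithm, not the Fisher scoring extension it proposes. The toy model (a Gaussian mean with known noise variance and a Gaussian prior), the function name `sgld_gaussian_mean`, and all default values are assumptions chosen for illustration.

```python
import numpy as np


def sgld_gaussian_mean(data, n_steps=5000, batch_size=32, step_size=1e-4,
                       prior_var=10.0, noise_var=1.0, seed=0):
    """Plain SGLD for the mean of a Gaussian (illustrative toy problem).

    Assumed model: theta ~ N(0, prior_var), x_i ~ N(theta, noise_var).
    Each step touches only `batch_size` data items.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        batch = rng.choice(data, size=batch_size, replace=False)
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Mini-batch gradient of the log likelihood, rescaled to the full data set.
        grad_lik = (n / batch_size) * np.sum(batch - theta) / noise_var
        # Langevin step: half-step along the gradient plus N(0, step_size) noise.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples[t] = theta
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(2.0, 1.0, size=1000)
    samples = sgld_gaussian_mean(data)
    # Discard early iterations as burn-in before summarizing.
    print("estimated posterior mean:", samples[1000:].mean())
```

The sketch uses a fixed scalar step size and no preconditioner. The abstract describes the proposed method as behaving like SGLD with a pre-conditioner matrix at slow mixing rates, which this toy version does not attempt to reproduce.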
Denoising Criterion for Variational Auto-Encoding Framework
Denoising autoencoders (DAE) are trained to reconstruct their clean inputs
with noise injected at the input level, while variational autoencoders (VAE)
are trained with noise injected in their stochastic hidden layer, with a
regularizer that encourages this noise injection. In this paper, we show that
injecting noise both in the input and in the stochastic hidden layer can be
advantageous and we propose a modified variational lower bound as an improved
objective function in this setup. When the input is corrupted, the standard
VAE lower bound involves marginalizing the encoder conditional distribution
over the input noise, which makes the training criterion intractable. Instead,
we propose a modified training criterion which corresponds to a tractable bound
when the input is corrupted. Experimentally, we find that the proposed denoising
variational autoencoder (DVAE) yields better average log-likelihood than the
VAE and the importance weighted autoencoder on the MNIST and Frey Face
datasets.
Comment: ICLR conference submission
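The intractability argument in this abstract can be written out as a short derivation. This is a sketch of one form consistent with the abstract, not a quotation from the paper. The symbols $q_\phi$, $p_\theta$, $p(\tilde{x}\mid x)$, and the labels $\mathcal{L}_{\mathrm{VAE}}$ and $\mathcal{L}_{\mathrm{DVAE}}$ are notation assumed here.

```latex
% Standard VAE lower bound with encoder q_\phi(z | x) and decoder p_\theta(x, z):
\log p_\theta(x) \;\ge\; \mathcal{L}_{\mathrm{VAE}}(x)
  = \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log \frac{p_\theta(x, z)}{q_\phi(z \mid x)} \right]

% With a corruption distribution p(\tilde{x} | x) on the input, the effective
% encoder is a mixture over corrupted inputs, whose density is intractable:
\tilde{q}_\phi(z \mid x) = \int q_\phi(z \mid \tilde{x}) \, p(\tilde{x} \mid x) \, d\tilde{x}

% A tractable alternative takes the expectation over the corruption outside the
% logarithm. For each fixed \tilde{x} the inner term is an ordinary lower bound
% on \log p_\theta(x), so the average is a lower bound as well:
\mathcal{L}_{\mathrm{DVAE}}(x)
  = \mathbb{E}_{p(\tilde{x} \mid x)} \, \mathbb{E}_{q_\phi(z \mid \tilde{x})}\!\left[
      \log \frac{p_\theta(x, z)}{q_\phi(z \mid \tilde{x})} \right]
  \;\le\; \log p_\theta(x)
```

The last expression can be estimated by sampling a corrupted input and then a latent code, without ever evaluating the mixture density.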
Neural Block-Slot Representations
In this paper, we propose a novel object-centric representation, called
Block-Slot Representation. Unlike the conventional slot representation, the
Block-Slot Representation provides concept-level disentanglement within a slot.
A block-slot is constructed by composing a set of modular concept
representations, called blocks, generated from a learned memory of abstract
concept prototypes. We call this block-slot construction process Block-Slot
Attention. Block-Slot Attention facilitates the emergence of abstract concept
blocks within a slot such as color, position, and texture, without any
supervision. This brings the benefits of disentanglement into slots and the
representation becomes more interpretable. Similar to Slot Attention, this
mechanism can be used as a drop-in module in any arbitrary neural architecture.
In experiments, we show that our model disentangles object properties
significantly better than the previous methods, including in complex textured
scenes. We also demonstrate the ability to compose novel scenes by composing
slots at the block level.
Object-Centric Slot Diffusion
The recent success of transformer-based image generative models in
object-centric learning highlights the importance of powerful image generators
for handling complex scenes. However, despite the high expressiveness of
diffusion models in image generation, their integration into object-centric
learning remains largely unexplored. In this paper, we explore
the feasibility and potential of integrating diffusion models into
object-centric learning and investigate the pros and cons of this approach. We
introduce Latent Slot Diffusion (LSD), a novel model that serves dual purposes:
it is the first object-centric learning model to replace conventional slot
decoders with a latent diffusion model conditioned on object slots, and it is
also the first unsupervised compositional conditional diffusion model that
operates without the need for supervised annotations like text. Through
experiments on various object-centric tasks, including the first application of
the FFHQ dataset in this field, we demonstrate that LSD significantly
outperforms state-of-the-art transformer-based decoders, particularly in more
complex scenes, and exhibits superior unsupervised compositional generation
quality. The project page is available at
https://latentslotdiffusion.github.io
Generating Factoid Questions With Recurrent Neural Networks: The 30M Factoid Question-Answer Corpus
Over the past decade, large-scale supervised learning corpora have enabled
machine learning researchers to make substantial advances. However, to
date, there are no large-scale question-answer corpora available. In this paper
we present the 30M Factoid Question-Answer Corpus, an enormous question-answer
pair corpus produced by applying a novel neural network architecture to the
knowledge base Freebase to transduce facts into natural language questions. The
produced question-answer pairs are evaluated both by human evaluators and using
automatic evaluation metrics, including well-established machine translation
and sentence similarity metrics. Across all evaluation criteria the
question-generation model outperforms the competing template-based baseline.
Furthermore, when presented to human evaluators, the generated questions appear
comparable in quality to real human-generated questions.
Comment: 13 pages, 1 figure, 7 tables
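For orientation, the kind of template-based baseline the abstract compares against can be sketched in a few lines. This illustrates the general idea of turning a (subject, relation, object) fact into a question-answer pair and is not the neural model from the paper. The relation names, the template wording, and the function `fact_to_question` are hypothetical.

```python
# Hypothetical templates keyed by relation name. Neither the relation names
# nor the question wording are taken from the paper or from Freebase.
TEMPLATES = {
    "place_of_birth": "Where was {subject} born?",
    "author": "Who wrote {subject}?",
}

# Generic fallback used when no relation-specific template exists.
FALLBACK = "What is the {relation} of {subject}?"


def fact_to_question(subject, relation, obj):
    """Turn a (subject, relation, object) fact into a (question, answer) pair."""
    template = TEMPLATES.get(relation, FALLBACK)
    # str.format ignores unused keyword arguments, so one call serves both
    # the specific templates and the fallback.
    question = template.format(subject=subject,
                               relation=relation.replace("_", " "))
    return question, obj


if __name__ == "__main__":
    print(fact_to_question("Ada Lovelace", "place_of_birth", "London"))
```

A fixed template per relation produces grammatical but repetitive questions, which is the usual limitation that motivates a learned generator.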